Context

John Doe remarked in #AP1432 that our application may contain a lot of code that isn't used at all. Before the big refactoring offensive starts in three weeks, we would like to identify and remove these unused code areas. They may contain a large number of SonarQube issues, and we don't want to waste time and money fixing code that isn't used.

Technical Setup

This section only sets up the dependencies and imports needed for the data analysis below.


In [1]:
%%classpath add mvn
tech.tablesaw tablesaw-beakerx 0.24.5
tech.tablesaw tablesaw-jsplot 0.24.5



In [2]:
%import tech.tablesaw.api.*
%import tech.tablesaw.columns.*
%import tech.tablesaw.plotly.Plot
%import tech.tablesaw.plotly.api.BubblePlot.*
%import static tech.tablesaw.aggregate.AggregateFunctions.*
   
tech.tablesaw.beakerx.TablesawDisplayer.register()
OutputCell.HIDDEN

Idea

Used code

To understand how much code isn't used, we recorded the code executed in production with the coverage tool JaCoCo. The measurement took place between 21st Oct 2017 and 27th Oct 2017. The results were exported to a CSV file using the JaCoCo command line tool with the following command:

java -jar jacococli.jar report "C:\Temp\jacoco.exec" --classfiles \
C:\dev\repos\buschmais-spring-petclinic\target\classes --csv jacoco.csv

The CSV file contains one row per class with JaCoCo's coverage counters, including the number of lines that were executed during the measurement period.


In [3]:
coverage = Table.read().csv("datasets/jacoco_production_coverage_spring_petclinic.csv")
coverage.first(5)


We keep only the data that is relevant here: the number of covered lines in LINE_COVERED together with the PACKAGE and CLASS columns.


In [4]:
coverage = coverage.retainColumns("PACKAGE", "CLASS", "LINE_COVERED")
coverage.first(5)


Technical Debt

To get an impression of how bad our code is, we run a set of static analysis tools via SonarQube. We have a CSV export of the latest SonarQube static code analysis results.


In [5]:
debt = Table.read().csv("datasets/sonar_issues_spring_petclinic.csv")
debt.first(5)


Aggregation

It was stated that whole packages aren't needed anymore and could be safely removed or at least ignored. Therefore, we sum up the per-class coverage data as well as the technical debt by their corresponding package.


In [6]:
coverage_per_package = coverage.summarize("LINE_COVERED", sum).by("PACKAGE");
coverage_per_package



In [7]:
debt_per_package = debt.summarize("debt", sum).by("package");
debt_per_package
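
To make the aggregation step more tangible, here is a minimal sketch in plain Java (standard library only, no Tablesaw) of what a "sum by package" does. The class name PackageAggregation, the method sumByPackage and the package names in the sample rows are hypothetical and only serve as an illustration; this is not how Tablesaw implements summarize internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PackageAggregation {

    // Sums a numeric value per package. Each row is {packageName, value}.
    // A TreeMap keeps the result sorted by package name.
    static Map<String, Integer> sumByPackage(List<String[]> rows) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String[] row : rows) {
            totals.merge(row[0], Integer.parseInt(row[1]), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical per-class rows: {PACKAGE, LINE_COVERED}
        List<String[]> rows = Arrays.asList(
            new String[] {"com.example.web", "12"},
            new String[] {"com.example.web", "8"},
            new String[] {"com.example.jdbc", "0"});
        // prints {com.example.jdbc=0, com.example.web=20}
        System.out.println(sumByPackage(rows));
    }
}
```

A package whose classes all report zero covered lines ends up with a sum of zero, which is the signal we are looking for later on.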


Joining

We combine both datasets to get an impression of how much the code in a specific package is used and how much debt that package contains.


In [8]:
joined = debt_per_package.join("package").inner(coverage_per_package, "PACKAGE")
joined
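
One property of an inner join is worth keeping in mind: a package shows up in the result only if it is present in both inputs. The following plain-Java sketch illustrates this under simplifying assumptions (per-package sums held in maps). The class PackageJoin, the method innerJoin and the sample package names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class PackageJoin {

    // Inner join of two per-package maps. Only packages that occur in BOTH
    // maps are kept; each result value is {debt, coveredLines}.
    static Map<String, int[]> innerJoin(Map<String, Integer> debt,
                                        Map<String, Integer> covered) {
        Map<String, int[]> joined = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : debt.entrySet()) {
            Integer lines = covered.get(entry.getKey());
            if (lines != null) {
                joined.put(entry.getKey(), new int[] {entry.getValue(), lines});
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Hypothetical per-package sums
        Map<String, Integer> debt = new HashMap<>();
        debt.put("com.example.web", 30);
        debt.put("com.example.jdbc", 120);
        Map<String, Integer> covered = new HashMap<>();
        covered.put("com.example.web", 20);
        covered.put("com.example.jdbc", 0);
        covered.put("com.example.util", 5); // no debt entry, dropped by the join

        for (Map.Entry<String, int[]> e : innerJoin(debt, covered).entrySet()) {
            System.out.println(e.getKey() + " debt=" + e.getValue()[0]
                + " covered=" + e.getValue()[1]);
        }
    }
}
```

In practice this means that a package without any SonarQube issue, or without any coverage row, would silently disappear from the joined table.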


Visualisation

We plot the coverage against the debt in an XY chart to get a brief overview of the result.


In [9]:
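// One point per package: the value is the summed number of covered lines
def points = new CategoryPoints(
    value: joined.column('Sum [LINE_COVERED]').asList(),
    size : 15,
    shape : ShapeType.CIRCLE)
// The categories are labelled with the summed debt per package
new CategoryPlot(
    categoryNames: joined.column('Sum [debt]').asList()) << points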


Conclusion

The JDBC package org.springframework.samples.petclinic.repository.jdbc isn't used at all (no line of it was executed during the measurement period) and can safely be left out when fixing static code analysis findings. For the remaining packages, we prioritize the refactoring work along the corresponding technical aspects.
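
Instead of reading the unused packages off the chart, they could also be listed directly. The sketch below assumes the per-package sums of covered lines are available as a map; the class UnusedPackages, the method withoutCoverage and the sample package names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class UnusedPackages {

    // Returns, sorted by name, all packages whose summed covered lines are zero.
    static List<String> withoutCoverage(Map<String, Integer> coveredLinesPerPackage) {
        List<String> unused = new ArrayList<>();
        for (Map.Entry<String, Integer> entry
                : new TreeMap<>(coveredLinesPerPackage).entrySet()) {
            if (entry.getValue() == 0) {
                unused.add(entry.getKey());
            }
        }
        return unused;
    }

    public static void main(String[] args) {
        // Hypothetical per-package sums of covered lines
        Map<String, Integer> covered = new HashMap<>();
        covered.put("com.example.web", 20);
        covered.put("com.example.jdbc", 0);
        // prints [com.example.jdbc]
        System.out.println(withoutCoverage(covered));
    }
}
```

Zero covered lines only means that nothing was executed during the measurement period, so code that runs rarely (for example, yearly batch jobs) would need a longer recording before it is declared unused.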